The Impact of Topology on Byzantine Containment in Stabilization

Swan Dubois, Toshimitsu Masuzawa, Sebastien Tixeuil



Abstract 



Self-stabilization is a versatile approach to fault-tolerance since it permits a distributed
system to recover from any transient fault that arbitrarily corrupts the contents of all memories
in the system. Byzantine tolerance is an attractive feature of distributed systems that makes it
possible to cope with arbitrary malicious behaviors.

We consider the well-known problem of constructing a maximum metric tree in this context.
Combining these two properties proves difficult: we demonstrate that it is impossible to contain
the impact of Byzantine nodes in a self-stabilizing context for maximum metric tree construction
(strict stabilization). We propose a weaker containment scheme called topology-aware strict
stabilization, and present a protocol for computing maximum metric trees that is optimal for
this scheme with respect to the impossibility result.



Keywords Byzantine fault, Distributed protocol, Fault tolerance, Stabilization, Spanning tree
construction

1 Introduction



The advent of ubiquitous large-scale distributed systems advocates that tolerance to various kinds of
faults and hazards must be included from the very early design of such systems. Self-stabilization [3]
is a versatile technique that permits forward recovery from any kind of transient fault,
while Byzantine fault-tolerance [10] is traditionally used to mask the effect of a limited number
of malicious faults. Making distributed systems tolerant to both transient and malicious faults is
appealing yet has proved difficult [6] [2] [12], as impossibility results are expected in many cases.

Two main paths have been followed to study the impact of Byzantine faults in the context of
self-stabilization:



• Byzantine fault masking. In completely connected synchronous systems, one of the most
studied problems in the context of self-stabilization with Byzantine faults is that of clock 
synchronization. In [Tl[6], probabilistic self-stabilizing protocols were proposed for up to one 
third of Byzantine processes, while in [H [9] deterministic solutions tolerate up to one fourth 
and one third of Byzantine processes, respectively.

• Byzantine containment. For local tasks (i.e. tasks whose correctness can be checked locally,
such as vertex coloring, link coloring, or dining philosophers), the notion of strict stabilization
was proposed [12] [13] [11]. Strict stabilization guarantees that there exists a containment
radius outside which the effect of permanent faults is masked, provided that the problem
specification makes it possible to break the causality chain that is caused by the faults. As
many problems are not local, it turns out that it is impossible to provide strict stabilization
for those.

Our Contribution. In this paper, we investigate the possibility of Byzantine containment in a
self-stabilizing setting for tasks that are global (i.e. for which there exists a causality chain of
size $r$, where $r$ depends on $n$, the size of the network), and focus on a global problem, namely
maximum metric tree construction. As strict stabilization is impossible with such global tasks, we
weaken the containment constraint by relaxing the notion of containment radius to that of containment
area: Byzantine processes may disturb infinitely often a set of processes that depends on
the topology of the system and on the location of Byzantine processes.

The main contribution of this paper is to present new possibility results for containing the
influence of unbounded Byzantine behaviors. In more detail, we define the notion of
topology-aware strict stabilization as a novel form of containment and introduce the containment
area to quantify the quality of the containment. The notion of topology-aware strict stabilization is
weaker than strict stabilization but stronger than the classical notion of self-stabilization
(i.e. every topology-aware strictly stabilizing protocol is self-stabilizing, but not necessarily strictly
stabilizing).

To demonstrate the possibility and effectiveness of our notion of topology-aware strict
stabilization, we consider maximum metric tree construction. It is shown in [12] that there exists no strictly
stabilizing protocol with a constant containment radius for this problem. In this paper, we provide 
a topology-aware strictly stabilizing protocol for maximum metric tree construction and we prove 
that the containment area of this protocol is optimal. 

2 Distributed System 

A distributed system $S = (P, L)$ consists of a set $P = \{v_1, v_2, \ldots, v_n\}$ of processes and a set $L$
of bidirectional communication links (simply called links). A link is an unordered pair of distinct
processes. A distributed system $S$ can be regarded as a graph whose vertex set is $P$ and whose link
set is $L$, so we use graph terminology to describe a distributed system $S$.

Processes $u$ and $v$ are called neighbors if $(u, v) \in L$. The set of neighbors of a process $v$ is
denoted by $N_v$, and its cardinality (the degree of $v$) is denoted by $\Delta_v$ ($= |N_v|$). The degree $\Delta$ of a
distributed system $S = (P, L)$ is defined as $\Delta = \max\{\Delta_v \mid v \in P\}$. We do not assume the existence of
a unique identifier for each process. Instead we assume each process can distinguish its neighbors
from each other by locally arranging them in some arbitrary order: the $k$-th neighbor of a process
$v$ is denoted by $N_v(k)$ ($1 \le k \le \Delta_v$). The distance between two processes $u$ and $v$ is the length of
the shortest path between $u$ and $v$.

In this paper, we consider distributed systems of arbitrary topology. We assume that a single 
process is distinguished as a root, and all the other processes are identical. 

We adopt the shared state model as a communication model in this paper, where each process 
can directly read the states of its neighbors. 



The variables that are maintained by processes denote process states. A process may take
actions during the execution of the system. An action is simply a function that is executed in an
atomic manner by the process. The actions executed by each process are described by a finite set
of guarded actions of the form ⟨guard⟩ → ⟨statement⟩. Each guard of process $u$ is a boolean
expression involving the variables of $u$ and its neighbors.

A global state of a distributed system is called a configuration and is specified by a product
of states of all processes. We define $C$ to be the set of all possible configurations of a distributed
system $S$. For a process set $R \subseteq P$ and two configurations $\rho$ and $\rho'$, we denote $\rho \xrightarrow{R} \rho'$ when $\rho$
changes to $\rho'$ by executing an action of each process in $R$ simultaneously. Notice that $\rho$ and $\rho'$
can be different only in the states of processes in $R$. For completeness of execution semantics, we
should clarify the configuration resulting from simultaneous actions of neighboring processes. The
action of a process depends only on its state at $\rho$ and the states of its neighbors at $\rho$, and the result
of the action is reflected in the state of the process at $\rho'$.

A schedule of a distributed system is an infinite sequence of process sets. Let $Q = R^1, R^2, \ldots$ be a
schedule, where $R^i \subseteq P$ holds for each $i$ ($i \ge 1$). An infinite sequence of configurations $e = \rho_0, \rho_1, \ldots$
is called an execution from an initial configuration $\rho_0$ by a schedule $Q$, if $e$ satisfies $\rho_{i-1} \xrightarrow{R^i} \rho_i$ for
each $i$ ($i \ge 1$). Process actions are executed atomically, and we also assume that a distributed
daemon schedules the actions of processes, i.e. any subset of processes can simultaneously execute
their actions.

The set of all possible executions from $\rho_0 \in C$ is denoted by $E_{\rho_0}$. The set of all possible
executions is denoted by $E$, that is, $E = \bigcup_{\rho \in C} E_{\rho}$. We consider asynchronous distributed systems
where we can make no assumption on schedules except that any schedule is weakly fair: every
process is contained in an infinite number of subsets appearing in any schedule.
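The step semantics above can be illustrated by a small simulation. The following is only a sketch under stated assumptions: the names `step`, `run`, and `spread_max` are hypothetical, a process state is modelled as a dictionary, and only a finite prefix of a schedule is executed. It shows that every process selected in $R$ reads the states of its neighbors at the current configuration and writes only to the next one.

```python
from typing import Callable, Dict, Hashable, Iterable, List

State = Dict[str, int]
Config = Dict[Hashable, State]


def step(config: Config, neighbors: Dict[Hashable, list],
         action: Callable[[State, Dict[Hashable, State]], State],
         selected: Iterable[Hashable]) -> Config:
    """Return the configuration reached when every process in `selected`
    executes `action` simultaneously (all reads use the old configuration)."""
    new_config = {v: dict(s) for v, s in config.items()}
    for v in selected:
        view = {u: config[u] for u in neighbors[v]}  # neighbor states at rho
        new_config[v] = action(config[v], view)      # result visible at rho'
    return new_config


def run(config: Config, neighbors: Dict[Hashable, list],
        action: Callable[[State, Dict[Hashable, State]], State],
        schedule: Iterable[Iterable[Hashable]]) -> List[Config]:
    """Return the finite execution prefix induced by a finite schedule prefix."""
    execution = [config]
    for selected in schedule:
        config = step(config, neighbors, action, selected)
        execution.append(config)
    return execution


def spread_max(state: State, view: Dict[Hashable, State]) -> State:
    """Toy action (illustration only): adopt the largest x seen locally."""
    return {"x": max([state["x"]] + [s["x"] for s in view.values()])}
```

On a three-process line where only one end holds a large value, a step that activates all processes updates the middle process but not the far end, because the far end read its neighbor's old state.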

In this paper, we consider (permanent) Byzantine faults: a Byzantine process (i.e. a
Byzantine-faulty process) can behave arbitrarily, independently of its actions. If $v$ is a Byzantine
process, $v$ can repeatedly change its variables arbitrarily.

3 Self-Stabilizing Protocol Resilient to Byzantine Faults

Problems considered in this paper are so-called static problems, i.e. they require the system to
find static solutions. For example, the spanning-tree construction problem is a static problem,
while the mutual exclusion problem is not. Some static problems can be defined by a specification
predicate (shortly, specification), $spec(v)$, for each process $v$: a configuration is a desired one (with
a solution) if every process satisfies $spec(v)$. A specification $spec(v)$ is a boolean expression on
variables of $P_v$ ($\subseteq P$) where $P_v$ is the set of processes whose variables appear in $spec(v)$. The
variables appearing in the specification are called output variables (shortly, O-variables). In what
follows, we consider a static problem defined by specification $spec(v)$.

Self-Stabilization. A self-stabilizing protocol ([3]) is a protocol that eventually reaches a legitimate
configuration, where $spec(v)$ holds at every process $v$, regardless of the initial configuration. Once it
reaches a legitimate configuration, every process never changes its O-variables and always satisfies
$spec(v)$. From this definition, a self-stabilizing protocol is expected to tolerate any number and
any type of transient faults since it can eventually recover from any configuration affected by the
transient faults. However, the recovery from any configuration is guaranteed only when every
process correctly executes its action from the configuration, i.e., we do not consider the existence of
permanently faulty processes.

Strict stabilization. When (permanent) Byzantine processes exist, Byzantine processes may not
satisfy $spec(v)$. In addition, correct processes near the Byzantine processes can be influenced and
may be unable to satisfy $spec(v)$. Nesterenko and Arora [12] define a strictly stabilizing protocol as
a self-stabilizing protocol resilient to an unbounded number of Byzantine processes.
Given an integer $c$, a $c$-correct process is a process defined as follows.

Definition 1 (c-correct process) A process is c-correct if it is correct (i.e. not Byzantine) and 
located at distance more than c from any Byzantine process. 

Definition 2 ($(c, f)$-containment) A configuration $\rho$ is $(c, f)$-contained for specification spec if,
given at most $f$ Byzantine processes, in any execution starting from $\rho$, every $c$-correct process $v$
always satisfies $spec(v)$ and never changes its O-variables.

The parameter $c$ of Definition 2 refers to the containment radius defined in [12]. The parameter
$f$ refers explicitly to the number of Byzantine processes, while [12] dealt with an unbounded number
of Byzantine faults (that is, $f \in \{0, \ldots, n\}$).
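As an illustration of Definition 1, the set of $c$-correct processes can be computed with a multi-source breadth-first search from the Byzantine processes. This is a minimal sketch, not part of the protocol: the function names and the adjacency-dictionary representation are assumptions made for the example.

```python
from collections import deque


def distances_from(sources, neighbors):
    """Hop distance from the nearest source to every reachable process (BFS)."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        v = queue.popleft()
        for u in neighbors[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist


def c_correct(neighbors, byzantine, c):
    """Processes that are correct and at distance more than c from any Byzantine process."""
    dist = distances_from(list(byzantine), neighbors)
    return {v for v in neighbors
            if v not in byzantine and dist.get(v, float("inf")) > c}
```

For instance, on a line of five processes whose first process is Byzantine, the 1-correct processes are the last three.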

Definition 3 ($(c, f)$-strict stabilization) A protocol is $(c, f)$-strictly stabilizing for specification
spec if, given at most $f$ Byzantine processes, any execution $e = \rho_0, \rho_1, \ldots$ contains a configuration
$\rho_i$ that is $(c, f)$-contained for spec.

An important limitation of the model of [12] is the notion of $r$-restrictive specifications.
Intuitively, a specification is $r$-restrictive if it prevents combinations of states that belong to two
processes $u$ and $v$ that are at least $r$ hops away. An important consequence related to Byzantine
tolerance is that the containment radius of protocols solving those specifications is at least $r$. For
some problems, such as the spanning tree construction we consider in this paper, $r$ cannot be
bounded by a constant. We can show that there exists no $(o(n), 1)$-strictly stabilizing protocol for
the spanning tree construction.

Topology-aware strict stabilization. In the previous paragraph, we saw that there exist a number
of impossibility results on strict stabilization due to the notion of $r$-restrictive specifications. To
circumvent these impossibility results, we define here a new notion, which is weaker than strict
stabilization: topology-aware strict stabilization (denoted by TA-strict stabilization for short).
Here, the requirement on the containment radius is relaxed, i.e. the set of processes which may be
disturbed by Byzantine ones is not reduced to the union of the $c$-neighborhoods of Byzantine processes
but can be defined depending on the topology of the system and on the location of Byzantine processes.

In the following, we give a formal definition of this new kind of Byzantine containment. From
now on, $B$ denotes the set of Byzantine processes and $S_B$ (which is a function of $B$) denotes a subset of
$V$ (intuitively, this set gathers all processes which may be disturbed by Byzantine processes).

Definition 4 ($S_B$-correct node) A node is $S_B$-correct if it is a correct node (i.e. not Byzantine)
which does not belong to $S_B$.

Definition 5 ($S_B$-legitimate configuration) A configuration $\rho$ is $S_B$-legitimate for spec if
every $S_B$-correct node $v$ is legitimate for spec (i.e. if $spec(v)$ holds).

Definition 6 ($(S_B, f)$-topology-aware containment) A configuration $\rho_0$ is $(S_B, f)$-topology-aware
contained for specification spec if, given at most $f$ Byzantine processes, in any execution
$e = \rho_0, \rho_1, \ldots$, every configuration is $S_B$-legitimate and every $S_B$-correct process never changes its
O-variables.



The parameter $S_B$ of Definition 6 refers to the containment area. Any process which belongs
to this set may be infinitely disturbed by Byzantine processes. The parameter $f$ refers explicitly
to the number of Byzantine processes.

Definition 7 ($(S_B, f)$-topology-aware strict stabilization) A protocol is $(S_B, f)$-topology-aware
strictly stabilizing for specification spec if, given at most $f$ Byzantine processes, any execution
$e = \rho_0, \rho_1, \ldots$ contains a configuration $\rho_i$ that is $(S_B, f)$-topology-aware contained for spec.

Note that, if $B$ denotes the set of Byzantine processes and $S_B = \{v \in V \mid \min\{d(v, b),\ b \in B\} \le c\}$,
then a $(S_B, f)$-topology-aware strictly stabilizing protocol is a $(c, f)$-strictly stabilizing
protocol. Then, a TA-strictly stabilizing protocol is generally weaker than a strictly stabilizing one,
but stronger than a classical self-stabilizing protocol (that may never meet its specification in the
presence of Byzantine processes).

The parameter $S_B$ is introduced to quantify the strength of fault containment; we do not require
each process to know the actual definition of the set. Actually, the protocol proposed in this paper
assumes no knowledge of this parameter.

4 Maximum Metric Tree Construction 

In this work, we deal with maximum (routing) metric trees as defined in [8] (note that [7] provides
a self-stabilizing solution to this problem). Informally, the goal of a routing protocol is to construct
a tree that simultaneously maximizes the metric values of all of the nodes with respect to some
total ordering $\prec$. In the following, we recall all definitions and notations introduced in [8].

Definition 8 (Routing metric) A routing metric (or just metric) is a five-tuple $(M, W, met, mr, \prec)$
where:

1. $M$ is a set of metric values,

2. $W$ is a set of edge weights,

3. $met$ is a metric function whose domain is $M \times W$ and whose range is $M$,

4. $mr$ is the maximum metric value in $M$ with respect to $\prec$ and is assigned to the root of the
system,

5. $\prec$ is a less-than total order relation over $M$ that satisfies the following three conditions for
arbitrary metric values $m$, $m'$, and $m''$ in $M$:

(a) irreflexivity: $m \nprec m$,

(b) transitivity: if $m \prec m'$ and $m' \prec m''$ then $m \prec m''$,

(c) totality: $m \prec m'$ or $m' \prec m$ or $m = m'$.

Any metric value $m \in M \setminus \{mr\}$ satisfies the utility condition (that is, there exist $w_0, \ldots, w_{k-1}$ in
$W$ and $m_0 = mr, m_1, \ldots, m_{k-1}, m_k = m$ in $M$ such that $\forall i \in \{1, \ldots, k\}$, $m_i = met(m_{i-1}, w_{i-1})$).

For instance, we provide the definition of three classical metrics with this model: the shortest
path metric ($\mathcal{SP}$), the flow metric ($\mathcal{F}$), and the reliability metric ($\mathcal{R}$).



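$\mathcal{SP} = (M_1, W_1, met_1, mr_1, \prec_1)$
where $M_1 = \mathbb{N}$
      $W_1 = \mathbb{N}$
      $met_1(m, w) = m + w$
      $mr_1 = 0$
      $\prec_1$ is the classical $>$ relation

$\mathcal{F} = (M_2, W_2, met_2, mr_2, \prec_2)$
where $mr_2 \in \mathbb{N}$
      $M_2 = \{0, \ldots, mr_2\}$
      $W_2 = \{0, \ldots, mr_2\}$
      $met_2(m, w) = \min\{m, w\}$
      $\prec_2$ is the classical $<$ relation

$\mathcal{R} = (M_3, W_3, met_3, mr_3, \prec_3)$
where $M_3 = [0, 1]$
      $W_3 = [0, 1]$
      $met_3(m, w) = m * w$
      $mr_3 = 1$
      $\prec_3$ is the classical $<$ relation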
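The three metrics above can be written down directly as small records. The following sketch is only an illustration: the `Metric` tuple and its field names are assumptions of this example, and the sets $M$ and $W$ are left implicit. The field `less(a, b)` encodes $a \prec b$.

```python
from typing import Any, Callable, NamedTuple


class Metric(NamedTuple):
    met: Callable[[Any, Any], Any]   # metric function met(m, w)
    mr: Any                          # metric value assigned to the root
    less: Callable[[Any, Any], bool]  # less(a, b) is True iff a "precedes" b


def shortest_path() -> Metric:
    # Smaller distances are better, so the order is the classical ">".
    return Metric(met=lambda m, w: m + w, mr=0, less=lambda a, b: a > b)


def flow(mr: int) -> Metric:
    # The flow of a path is the minimum capacity along it.
    return Metric(met=lambda m, w: min(m, w), mr=mr, less=lambda a, b: a < b)


def reliability() -> Metric:
    # The reliability of a path is the product of its link reliabilities.
    return Metric(met=lambda m, w: m * w, mr=1.0, less=lambda a, b: a < b)
```

Encoding the order as a predicate rather than relying on Python's `<` keeps the shortest path metric, whose order is reversed, on the same footing as the other two.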

Definition 9 (Assigned metric) An assigned metric over a system $S$ is a six-tuple $(M, W, met,
mr, \prec, wf)$ where $(M, W, met, mr, \prec)$ is a metric and $wf$ is a function that assigns to each edge of
$S$ a weight in $W$.

Let a rooted path (from $v$) be a simple path from a process $v$ to the root $r$. The next set of
definitions is with respect to an assigned metric $(M, W, met, mr, \prec, wf)$ over a given system $S$.

Definition 10 (Metric of a rooted path) The metric of a rooted path in $S$ is the prefix sum
of $met$ over the edge weights in the path and $mr$.

For example, if a rooted path $p$ in $S$ is $v_k, \ldots, v_0$ with $v_0 = r$, then the metric of $p$ is
$m_k = met(m_{k-1}, wf(\{v_k, v_{k-1}\}))$ with $\forall i \in \{1, \ldots, k-1\}$, $m_i = met(m_{i-1}, wf(\{v_i, v_{i-1}\}))$ and $m_0 = mr$.
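The prefix sum of Definition 10 is a left fold of $met$ over the edge weights, starting from $mr$ at the root. A minimal sketch, assuming the weights are listed from the root outwards (the function name `path_metric` is hypothetical):

```python
from functools import reduce


def path_metric(met, mr, weights_from_root):
    """Metric of a rooted path v_k, ..., v_0 = r.

    `weights_from_root` lists wf({v_1, v_0}), ..., wf({v_k, v_{k-1}}), so the
    fold computes m_i = met(m_{i-1}, wf({v_i, v_{i-1}})) with m_0 = mr.
    """
    return reduce(met, weights_from_root, mr)
```

With the shortest path metric this is the sum of the weights; with the flow metric it is the smallest capacity on the path, capped by $mr$.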

Definition 11 (Maximum metric path) A rooted path $p$ from $v$ in $S$ is called a maximum
metric path with respect to an assigned metric if and only if for every other rooted path $q$ from $v$
in $S$, the metric of $p$ is greater than or equal to the metric of $q$ with respect to the total order $\prec$.

Definition 12 (Maximum metric of a node) The maximum metric of a node $v \neq r$ (or simply
metric value of $v$) in $S$ is defined by the metric of a maximum metric path from $v$. The maximum
metric of $r$ is $mr$.
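For small systems, the maximum metric of every node can be computed centrally by repeated relaxation, in the style of Bellman-Ford. The sketch below is not the distributed protocol of this paper; it assumes a connected system, a metric for which extending a path never improves its value and better prefixes never give worse extensions (the bounded and monotonic metrics characterized later in this section), and a hypothetical representation where `weight` maps `frozenset({u, v})` to the edge weight and `less(a, b)` encodes $a \prec b$.

```python
def maximum_metrics(neighbors, weight, root, met, mr, less):
    """Return a dict mapping each reachable node to its maximum metric."""
    level = {root: mr}
    # n - 1 relaxation rounds are enough to account for every simple path.
    for _ in range(len(neighbors) - 1):
        changed = False
        for v in neighbors:
            if v == root:
                continue
            for u in neighbors[v]:
                if u not in level:
                    continue
                candidate = met(level[u], weight[frozenset((u, v))])
                # Keep the candidate if v has no value yet or if it is better.
                if v not in level or less(level[v], candidate):
                    level[v] = candidate
                    changed = True
        if not changed:
            break
    return level
```

On a triangle, the shortest path metric selects the two-hop route when the direct edge is heavier, and the flow metric selects the route with the largest bottleneck.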

Definition 13 (Maximum metric tree) A spanning tree $T$ of $S$ is a maximum metric tree with
respect to an assigned metric over $S$ if and only if every rooted path in $T$ is a maximum metric
path in $S$ with respect to the assigned metric.

The goal of the work of [8] is the study of metrics that always allow the construction of a
maximum metric tree. More formally, the definition follows.

Definition 14 (Maximizable metric) A metric is maximizable if and only if for any assignment
of this metric over any system $S$, there is a maximum metric tree for $S$ with respect to the
assigned metric.

Note that [7] provides a self-stabilizing protocol to construct a maximum metric tree with
respect to any maximizable metric. Moreover, [8] provides a full characterization of maximizable
metrics as follows.

Definition 15 (Boundedness) A metric $(M, W, met, mr, \prec)$ is bounded if and only if:
$\forall m \in M, \forall w \in W$, $met(m, w) \prec m$ or $met(m, w) = m$.

Figure 1: Examples of containment areas for SP spanning tree construction.

Definition 16 (Monotonicity) A metric $(M, W, met, mr, \prec)$ is monotonic if and only if:
$\forall (m, m') \in M^2, \forall w \in W$, $m \preceq m' \Rightarrow (met(m, w) \prec met(m', w)$ or $met(m, w) = met(m', w))$.

Theorem 1 (Characterization of maximizable metrics [8]) A metric is maximizable if and
only if this metric is bounded and monotonic.
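The two conditions of Theorem 1 can be checked exhaustively on finite samples of $M$ and $W$. The sketch below is only a sanity check under that assumption: for infinite sets such as $\mathbb{N}$ or $[0, 1]$ it can refute a property on the sample but cannot prove it in general. The function names are hypothetical, and `less(a, b)` encodes $a \prec b$.

```python
from itertools import product


def preceq(less, a, b):
    """a precedes b or a equals b."""
    return less(a, b) or a == b


def is_bounded(M, W, met, less):
    """Definition 15 on finite samples: met(m, w) never exceeds m."""
    return all(preceq(less, met(m, w), m) for m, w in product(M, W))


def is_monotonic(M, W, met, less):
    """Definition 16 on finite samples: met preserves the order on metric values."""
    return all(preceq(less, met(m1, w), met(m2, w))
               for m1, m2, w in product(M, M, W)
               if preceq(less, m1, m2))
```

For example, addition ordered by the classical $>$ relation (the shortest path metric) passes both checks on a sample of small integers, whereas addition ordered by the classical $<$ relation fails boundedness.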

Given a maximizable metric $\mathcal{M} = (M, W, mr, met, \prec)$, the aim of this work is to construct a
maximum metric tree with respect to $\mathcal{M}$ which spans the system in a self-stabilizing way in a
system subject to permanent Byzantine failures. It is obvious that these Byzantine processes may
disturb some correct processes. This is why we relax the problem in the following way: we want to
construct a maximum metric forest with respect to $\mathcal{M}$. The root of any tree of this forest must be
either the real root or a Byzantine process.

Each process $v$ has three O-variables: a pointer to its parent in its tree ($prnt_v \in N_v \cup \{\bot\}$), a
level which stores its current metric value ($level_v \in M$), and a variable which stores its distance to
the root of its tree ($dist_v \in \{0, \ldots, D\}$). Obviously, Byzantine processes may disturb (at least) their
neighbors. We use the following specification of the problem.

We introduce new notations as follows. Given an assigned metric $(M, W, met, mr, \prec, wf)$ over
the system $S$ and two processes $u$ and $v$, we denote by $\mu(u, v)$ the maximum metric of node $u$ when
$v$ plays the role of the root of the system and by $w_{u,v}$ the weight of the edge $\{u, v\}$ (that is, the
value of $wf(\{u, v\})$).

Definition 17 ($\mathcal{M}$-path) Given an assigned metric $\mathcal{M} = (M, W, mr, met, \prec, wf)$ over a system
$S$, a path $(v_0, \ldots, v_k)$ ($k \ge 1$) of $S$ is a $\mathcal{M}$-path if and only if:

1. $prnt_{v_0} = \bot$, $level_{v_0} = 0$, $dist_{v_0} = 0$, and $v_0 \in B \cup \{r\}$,

2. $\forall i \in \{1, \ldots, k\}$, $prnt_{v_i} = v_{i-1}$, $level_{v_i} = met(level_{v_{i-1}}, w_{v_i,v_{i-1}})$, and $dist_{v_i} = i$,

3. $\forall i \in \{1, \ldots, k\}$, $met(level_{v_{i-1}}, w_{v_i,v_{i-1}}) = \max_{\prec}\{met(level_u, w_{v_i,u}),\ u \in N_{v_i}\}$, and

4. $level_{v_k} = \mu(v_k, v_0)$.

Figure 2: Examples of containment areas for flow spanning tree construction.

We define the specification predicate $spec(v)$ of the maximum metric tree construction with
respect to a maximizable metric $\mathcal{M}$ as follows.

$$spec(v) : \begin{cases} prnt_v = \bot,\ level_v = 0,\ \text{and } dist_v = 0 & \text{if } v \text{ is the root } r \\ \text{there exists a } \mathcal{M}\text{-path } (v_0, \ldots, v_k) \text{ such that } v_k = v & \text{otherwise} \end{cases}$$

Following the discussion of Section 3, it is obvious that there exists no strictly stabilizing protocol
for this problem. This is why we consider the weaker notion of topology-aware strict stabilization.
First, we show an impossibility result in order to define the best possible containment area. Then,
we provide a maximum metric tree construction protocol which is $(S_B, f)$-TA-strictly stabilizing
where $f \le n - 1$ and which matches this optimal containment area, namely:

$$S_B = \{v \in V \setminus B \mid \mu(v, r) \preceq \max_{\prec}\{\mu(v, b),\ b \in B\}\} \setminus \{r\}$$

Figures 1 to 3 provide some examples of containment areas with respect to several
maximizable metrics.
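The containment area $S_B$ can be computed centrally on small examples by evaluating $\mu(v, r)$ and $\mu(v, b)$ for every Byzantine process $b$. The sketch below is an illustration under assumptions: the system is connected, the metric is maximizable, `weight` maps `frozenset({u, v})` to the edge weight, `less(a, b)` encodes $a \prec b$, and the helper `maximum_metrics` is a centralized relaxation procedure that is not part of the protocol.

```python
def maximum_metrics(neighbors, weight, root, met, mr, less):
    """mu(., root): maximum metric of every node when `root` plays the root."""
    level = {root: mr}
    for _ in range(len(neighbors) - 1):
        changed = False
        for v in neighbors:
            if v == root:
                continue
            for u in neighbors[v]:
                if u in level:
                    candidate = met(level[u], weight[frozenset((u, v))])
                    if v not in level or less(level[v], candidate):
                        level[v] = candidate
                        changed = True
        if not changed:
            break
    return level


def containment_area(neighbors, weight, root, byzantine, met, mr, less):
    """Correct non-root processes v whose mu(v, r) does not strictly exceed
    the best mu(v, b) over Byzantine processes b."""
    mu_root = maximum_metrics(neighbors, weight, root, met, mr, less)
    mu_byz = [maximum_metrics(neighbors, weight, b, met, mr, less)
              for b in byzantine]
    area = set()
    for v in neighbors:
        if v == root or v in byzantine:
            continue
        best = None
        for mu_b in mu_byz:
            candidate = mu_b.get(v)
            if candidate is None:
                continue
            if best is None or less(best, candidate):
                best = candidate
        if best is not None and (less(mu_root[v], best) or mu_root[v] == best):
            area.add(v)
    return area
```

With the shortest path metric and unit weights, a process belongs to the area exactly when it is at least as close to some Byzantine process as to the root.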

We introduce here a new definition that is used in the following. 

Definition 18 (Fixed point) A metric value $m$ is a fixed point of a metric $\mathcal{M} = (M, W, mr, met,
\prec)$ if $m \in M$ and if for any value $w \in W$, we have: $met(m, w) = m$.
Figure 3: Examples of containment areas for reliability spanning tree construction.
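The fixed point condition of Definition 18 is easy to test when $W$ is finite (or on a finite sample of $W$, which is an assumption of this sketch; the function name is hypothetical).

```python
def is_fixed_point(m, W, met):
    """True iff met(m, w) == m for every sampled weight w."""
    return all(met(m, w) == m for w in W)
```

For the flow metric, 0 is a fixed point since $\min\{0, w\} = 0$ for every weight, whereas a positive value is not as soon as a smaller weight exists.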

4.1 Impossibility Result 

In this section, we show that there exist some constraints on the containment area of any
topology-aware strictly stabilizing protocol for the maximum metric tree construction, depending on the metric.

Theorem 2 Given a maximizable metric $\mathcal{M} = (M, W, mr, met, \prec)$, even under the central
daemon, there exists no $(A_B, 1)$-TA-strictly stabilizing protocol for maximum metric spanning tree
construction with respect to $\mathcal{M}$ where $A_B \subsetneq S_B$.

Proof Let $\mathcal{M} = (M, W, mr, met, \prec)$ be a maximizable metric and $\mathcal{P}$ be a $(A_B, 1)$-TA-strictly
stabilizing protocol for maximum metric spanning tree construction with respect to $\mathcal{M}$
where $A_B \subsetneq S_B$. We must distinguish the following cases:

Case 1: $|M| = 1$.

Denote by $m$ the metric value such that $M = \{m\}$. For any system and for any process $v \neq r$,
we have $\mu(v, r) = \min_{\prec}\{\mu(v, b),\ b \in B\} = m$. Consequently, $S_B = V \setminus (B \cup \{r\})$ for any system.

Consider the following system: $V = \{r, u, v, b\}$ and $E = \{\{r, u\}, \{u, v\}, \{v, b\}\}$ ($b$ is a Byzantine
process). As $S_B = \{u, v\}$ and $A_B \subsetneq S_B$, we have: $u \notin A_B$ or $v \notin A_B$.
Consider now the following configuration $\rho_0$: $prnt_r = prnt_b = \bot$, $prnt_v = b$, $prnt_u = v$,
$level_r = level_u = level_v = level_b = m$, $dist_r = dist_b = 0$, $dist_v = 1$ and $dist_u = 2$ (see
Figure 4; other variables may have arbitrary values). Note that $\rho_0$ is $A_B$-legitimate for spec
(whatever $A_B$ is).

Assume now that $b$ behaves as a correct process with respect to $\mathcal{P}$. Then, by convergence of
$\mathcal{P}$ in a fault-free system starting from $\rho_0$ which is not legitimate (remember that a
strictly-stabilizing protocol is a special case of self-stabilizing protocol), we can deduce that the system
reaches in a finite time a configuration $\rho_1$ (see Figure 4) in which: $prnt_r = \bot$, $prnt_u = r$,
$prnt_v = u$, $prnt_b = v$, $level_r = level_u = level_v = level_b = m$, $dist_r = 0$, $dist_u = 1$, $dist_v = 2$
and $dist_b = 3$. Note that processes $u$ and $v$ modify their O-variables in this execution. This
contradicts the $(A_B, 1)$-TA-strict stabilization of $\mathcal{P}$ (whatever $A_B$ is).

Figure 4: Configurations used in proof of Theorem 2.

Case 2: $|M| \ge 2$.

By definition of a bounded metric, we can deduce that there exist $m \in M$ and $w \in W$ such
that $m = met(mr, w) \prec mr$. Then, we must distinguish the following cases:

Case 2.1: $m$ is a fixed point of $\mathcal{M}$.

Consider the following system: $V = \{r, u, v, b\}$, $E = \{\{r, u\}, \{u, v\}, \{v, b\}\}$, $w_{r,u} = w_{v,b} = w$,
and $w_{u,v} = w'$ ($b$ is a Byzantine process). As for any $w' \in W$, $met(m, w') = m$ (by
definition of a fixed point), we have: $S_B = \{u, v\}$. Since $A_B \subsetneq S_B$, we have: $u \notin A_B$ or
$v \notin A_B$. Consider now the following configuration $\rho_0$: $prnt_r = prnt_b = \bot$, $prnt_v = b$,
$prnt_u = v$, $level_r = level_b = mr$, $level_u = level_v = m$, $dist_r = dist_b = 0$, $dist_v = 1$
and $dist_u = 2$ (see Figure 4; other variables may have arbitrary values). Note that $\rho_0$ is
$A_B$-legitimate for spec (whatever $A_B$ is).

Assume now that $b$ behaves as a correct process with respect to $\mathcal{P}$. Then, by convergence
of $\mathcal{P}$ in a fault-free system starting from $\rho_0$ which is not legitimate (remember that a
strictly-stabilizing protocol is a special case of self-stabilizing protocol), we can deduce
that the system reaches in a finite time a configuration $\rho_1$ (see Figure 4) in which:
$prnt_r = \bot$, $prnt_u = r$, $prnt_v = u$, $prnt_b = v$, $level_r = mr$, $level_u = level_v = level_b = m$
(since $m$ is a fixed point), $dist_r = 0$, $dist_u = 1$, $dist_v = 2$ and $dist_b = 3$. Note
that processes $u$ and $v$ modify their O-variables in this execution. This contradicts the
$(A_B, 1)$-TA-strict stabilization of $\mathcal{P}$ (whatever $A_B$ is).

Case 2.2: $m$ is not a fixed point of $\mathcal{M}$.

This implies that there exists $w' \in W$ such that: $met(m, w') \prec m$ (remember that $\mathcal{M}$ is
bounded). Consider the following system: $V = \{r, u, v, v', b\}$, $E = \{\{r, u\}, \{u, v\}, \{u, v'\},
\{v, b\}, \{v', b\}\}$, $w_{r,u} = w_{v,b} = w_{v',b} = w$, and $w_{u,v} = w_{u,v'} = w'$ ($b$ is a Byzantine
process). We can see that $S_B = \{v, v'\}$. Since $A_B \subsetneq S_B$, we have: $v \notin A_B$ or $v' \notin A_B$.
Consider now the following configuration $\rho_0$: $prnt_r = prnt_b = \bot$, $prnt_v = prnt_{v'} = b$,
$prnt_u = r$, $level_r = level_b = mr$, $level_u = level_v = level_{v'} = m$, $dist_r = dist_b = 0$,
$dist_v = dist_{v'} = 1$ and $dist_u = 1$ (see Figure 4; other variables may have arbitrary
values). Note that $\rho_0$ is $A_B$-legitimate for spec (whatever $A_B$ is).

Assume now that $b$ behaves as a correct process with respect to $\mathcal{P}$. Then, by convergence
of $\mathcal{P}$ in a fault-free system starting from $\rho_0$ which is not legitimate (remember that a
strictly-stabilizing protocol is a special case of self-stabilizing protocol), we can deduce
that the system reaches in a finite time a configuration $\rho_1$ (see Figure 4) in which:
$prnt_r = \bot$, $prnt_u = r$, $prnt_v = prnt_{v'} = u$, $prnt_b = v$ (or $prnt_b = v'$), $level_r = mr$,
$level_u = m$, $level_v = level_{v'} = met(m, w') = m'$, $level_b = met(m', w) = m''$, $dist_r = 0$,
$dist_u = 1$, $dist_v = dist_{v'} = 2$ and $dist_b = 3$. Note that processes $v$ and $v'$ modify their
O-variables in this execution. This contradicts the $(A_B, 1)$-TA-strict stabilization of $\mathcal{P}$
(whatever $A_B$ is).

□

4.2 Topology-Aware Strict Stabilizing Protocol 

In this section, we provide our self-stabilizing protocol that achieves optimal containment areas with
respect to permanent Byzantine failures for constructing a maximum metric tree for any maximizable metric
$\mathcal{M} = (M, W, met, mr, \prec)$. More formally, our protocol is $(S_B, f)$-TA-strictly stabilizing, which is optimal
with respect to the result of Theorem 2. Our protocol is borrowed from the one of [7] (which is
self-stabilizing). The key idea of this protocol is to use the distance variable (upper bounded by
a given constant $D$) to detect and break cycles of processes which have the same maximum metric.
The main modification we bring to this protocol follows. In the initial protocol, when a process
modifies its parent, it chooses arbitrarily one of the "better" neighbors (with respect to the metric).
To achieve the $(S_B, f)$-TA-strict stabilization, we must ensure a fair selection among the set of its
neighbors. We achieve this fairness with a round-robin order over the set of neighbors. Our
solution is presented as Algorithm 4.1.

In the following, we provide the proof of the TA-strict stabilization of $\mathcal{SSMAX}$. Remember
that the real root $r$ cannot be a Byzantine process by hypothesis. Note that the subsystem whose
set of nodes is $V \setminus S_B$ is connected, by boundedness of the metric.

Lemma 1 For any process $v \in V$, we have:

$$\forall u \in N_v,\quad met\Big(\max_{\prec}\{\mu(u, p),\ p \in B \cup \{r\}\},\ w_{u,v}\Big) \preceq \max_{\prec}\{\mu(v, p),\ p \in B \cup \{r\}\}$$
Algorithm 4.1 $\mathcal{SSMAX}$: A TA-strictly stabilizing protocol for maximum metric tree construction.

Data:
  $N_v$: totally ordered set of neighbors of $v$.
  $D$: upper bound of the number of processes in a simple path.

Variables:
  $prnt_v$ ($= \bot$ if $v = r$, $\in N_v$ if $v \neq r$): pointer on the parent of $v$ in the tree.
  $level_v \in \{m \in M \mid m \preceq mr\}$: metric of the node.
  $dist_v \in \{0, \ldots, D\}$: distance to the root.

Macro:
  For any subset $A \subseteq N_v$, $choose(A)$ returns the first element of $A$ which is bigger than $prnt_v$ (in a round-robin fashion).

Rules:
  $(R_r)$ :: $(v = r) \wedge ((level_v \neq mr) \vee (dist_v \neq 0))$
      → $level_v := mr$; $dist_v := 0$

  $(R_1)$ :: $(v \neq r) \wedge (prnt_v \in N_v) \wedge ((dist_v \neq \min(dist_{prnt_v} + 1, D)) \vee (level_v \neq met(level_{prnt_v}, w_{v,prnt_v})))$
      → $dist_v := \min(dist_{prnt_v} + 1, D)$; $level_v := met(level_{prnt_v}, w_{v,prnt_v})$

  $(R_2)$ :: $(v \neq r) \wedge (dist_v = D) \wedge (\exists u \in N_v, dist_u < D - 1)$
      → $prnt_v := choose(\{u \in N_v \mid dist_u < D - 1\})$; $dist_v := dist_{prnt_v} + 1$; $level_v := met(level_{prnt_v}, w_{v,prnt_v})$

  $(R_3)$ :: $(v \neq r) \wedge (\exists u \in N_v, (dist_u < D - 1) \wedge (level_v \prec met(level_u, w_{u,v})))$
      → $prnt_v := choose(\{u \in N_v \mid (dist_u < D - 1) \wedge (met(level_u, w_{u,v}) = \max_{\prec}\{met(level_q, w_{q,v}),\ q \in N_v,\ dist_q < D - 1\})\})$;
        $level_v := met(level_{prnt_v}, w_{prnt_v,v})$; $dist_v := dist_{prnt_v} + 1$
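To make the rules of Algorithm 4.1 concrete, the following sketch simulates them on a fault-free system under a sequential scheduler. Several points are assumptions of this illustration and not statements of the paper: the helper names, the dictionary-based state, the scheduler that scans processes in a fixed order, and in particular the priority $(R_3)$, then $(R_2)$, then $(R_1)$ used when several rules of a process are enabled. Here `less(a, b)` encodes $a \prec b$ and `w` maps `frozenset({u, v})` to the edge weight.

```python
BOT = None  # stands for the "no parent" value of the root


def choose(order, prnt, candidates):
    """First candidate strictly after prnt in the cyclic neighbor order."""
    start = order.index(prnt) if prnt in order else -1
    for k in range(1, len(order) + 1):
        u = order[(start + k) % len(order)]
        if u in candidates:
            return u
    return prnt


def move(v, cfg, nbrs, w, root, D, met, mr, less):
    """New state of v after one enabled rule, or None if v is not enabled."""
    s = cfg[v]
    if v == root:                                        # rule (R_r)
        if s["level"] != mr or s["dist"] != 0:
            return {"prnt": BOT, "level": mr, "dist": 0}
        return None
    # Metric value that each neighbor u currently offers to v.
    offer = {u: met(cfg[u]["level"], w[frozenset((u, v))]) for u in nbrs[v]}
    near = [u for u in nbrs[v] if cfg[u]["dist"] < D - 1]
    if any(less(s["level"], offer[u]) for u in near):    # rule (R_3)
        best = near[0]
        for u in near:
            if less(offer[best], offer[u]):
                best = u
        cands = {u for u in near if offer[u] == offer[best]}
        p = choose(nbrs[v], s["prnt"], cands)
        return {"prnt": p, "level": offer[p], "dist": cfg[p]["dist"] + 1}
    if s["dist"] == D and near:                          # rule (R_2)
        p = choose(nbrs[v], s["prnt"], set(near))
        return {"prnt": p, "level": offer[p], "dist": cfg[p]["dist"] + 1}
    p = s["prnt"]                                        # rule (R_1)
    if p in nbrs[v]:
        d = min(cfg[p]["dist"] + 1, D)
        if s["dist"] != d or s["level"] != offer[p]:
            return {"prnt": p, "level": offer[p], "dist": d}
    return None


def stabilize(cfg, nbrs, w, root, D, met, mr, less, max_passes=1000):
    """Activate processes one at a time until no process is enabled."""
    for _ in range(max_passes):
        progress = False
        for v in nbrs:
            new_state = move(v, cfg, nbrs, w, root, D, met, mr, less)
            if new_state is not None:
                cfg[v] = new_state
                progress = True
        if not progress:
            return cfg
    raise RuntimeError("no stabilization within max_passes")
```

Starting a three-process line from a corrupted configuration in which the two non-root processes point to each other, the distance variables first grow up to $D$, which breaks the cycle, and the processes then attach themselves towards the real root.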



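Proof Let $v \in V$ be a process. By contradiction, assume that there exists a neighbor $u$ of $v$ such
that:

$$\max_{\prec}\{\mu(v, p),\ p \in B \cup \{r\}\} \prec met\Big(\max_{\prec}\{\mu(u, p),\ p \in B \cup \{r\}\},\ w_{u,v}\Big)$$

Let $q \in B \cup \{r\}$ be one of the processes such that $\max_{\prec}\{\mu(u, p),\ p \in B \cup \{r\}\} = \mu(u, q)$. Then, we have:

$$\begin{aligned} \max_{\prec}\{\mu(v, p),\ p \in B \cup \{r\}\} &\prec met(\mu(u, q), w_{u,v}) && \text{by construction of } q \\ &\preceq \mu(v, q) && \text{since } met(\mu(u, q), w_{u,v}) \preceq \mu(v, q) \end{aligned}$$

This contradicts the fact that $q \in B \cup \{r\}$ and shows us the result. □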

Given a configuration $\rho \in C$ and a metric value $m \in M$, let us define the following predicate:

$$IM_m(\rho) \equiv \forall v \in V,\ level_v \preceq \max_{\prec}\Big\{m,\ \max_{\prec}\{\mu(v, u),\ u \in B \cup \{r\}\}\Big\}$$

Lemma 2 For any metric value $m \in M$, the predicate $IM_m$ is closed by actions of $\mathcal{SSMAX}$.

Proof Let $m$ be a metric value ($m \in M$). Let $\rho \in C$ be a configuration such that $IM_m(\rho) = true$
and $\rho' \in C$ be a configuration such that $\rho \xrightarrow{R} \rho'$ is a step of $\mathcal{SSMAX}$.

If the root process $r \in R$ (respectively a Byzantine process $b \in R$), then we have $level_r = mr$
(respectively $level_b \preceq mr$) in $\rho'$ by construction of $(R_r)$ (respectively by definition of $level_b$). Hence,
$level_r \preceq \max_{\prec}\{m, \max_{\prec}\{\mu(r, u),\ u \in B \cup \{r\}\}\} = mr$ (respectively
$level_b \preceq \max_{\prec}\{m, \max_{\prec}\{\mu(b, u),\ u \in B \cup \{r\}\}\} = mr$).

If a correct process $v \in R$ with $v \neq r$, then there exists a neighbor $p$ of $v$ such that
$level_p \preceq \max_{\prec}\{m, \max_{\prec}\{\mu(p, u),\ u \in B \cup \{r\}\}\}$ in $\rho$ (since $IM_m(\rho) = true$) and $prnt_v = p$ and
$level_v = met(level_p, w_{v,p})$ in $\rho'$ (since $v$ is activated during this step).

If we apply Lemma 1 to $met$ and to neighbor $p$, we obtain the following property:

$$met\Big(\max_{\prec}\{\mu(p, u),\ u \in B \cup \{r\}\},\ w_{v,p}\Big) \preceq \max_{\prec}\{\mu(v, u),\ u \in B \cup \{r\}\}$$

Consequently, we obtain that, in $\rho'$:

$$\begin{aligned} level_v &= met(level_p, w_{v,p}) \\ &\preceq met\Big(\max_{\prec}\big\{m, \max_{\prec}\{\mu(p, u),\ u \in B \cup \{r\}\}\big\},\ w_{v,p}\Big) && \text{by boundedness of } \mathcal{M} \\ &\preceq \max_{\prec}\Big\{met(m, w_{v,p}),\ met\big(\max_{\prec}\{\mu(p, u),\ u \in B \cup \{r\}\},\ w_{v,p}\big)\Big\} \\ &\preceq \max_{\prec}\Big\{m,\ \max_{\prec}\{\mu(v, u),\ u \in B \cup \{r\}\}\Big\} && \text{since } met(m, w_{v,p}) \preceq m \end{aligned}$$

We can deduce that $IM_m(\rho') = true$, which concludes the proof. □

Given an assigned metric to a system $G$, we can observe that the set of metric values $M$ is finite and that we can label the elements of $M$ by $m_0 = mr, m_1, \ldots, m_k$ in such a way that $\forall i \in \{0, \ldots, k-1\},\ m_{i+1} \prec m_i$.

We introduce the following notations: 

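$$\forall m_i \in M,\ P_{m_i} = \{v \in V \setminus S_B \mid \mu(v,r) = m_i\}$$

$$\forall m_i \in M,\ V_{m_i} = \bigcup_{j=0}^{i} P_{m_j}$$

$$\forall m_i \in M,\ I_{m_i} = \{v \in V \mid \max_{\prec}\{\mu(v,u) \mid u \in B \cup \{r\}\} \prec m_i\}$$

$$\forall m_i \in M,\ \mathcal{LC}_{m_i} = \{\rho \in C \mid (\forall v \in V_{m_i},\ spec(v)) \wedge (IM_{m_i}(\rho))\}$$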
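To make these notations concrete, the sets can be computed on a toy instance. The sketch below assumes integer metric values listed in decreasing order ($m_0 = mr$ first); `mu_r[v]` stands for μ(v, r), `top[v]` for the maximum of μ(v, u) over u in B ∪ {r}, and `s_b` for the containment area $S_B$, which is simply passed in as an input since it is defined earlier in the paper. All values and names are hypothetical.

```python
def layers(values, mu_r, top, s_b):
    """Return the lists P, V, I indexed by i (one entry per metric value m_i)."""
    # P[i]: processes outside S_B whose best metric to the root is m_i.
    P = [{v for v in mu_r if v not in s_b and mu_r[v] == m} for m in values]
    # V[i]: union of P[0..i].
    V = [set().union(*P[:i + 1]) for i in range(len(values))]
    # I[i]: processes whose bound over B ∪ {r} is strictly below m_i.
    I = [{v for v in top if top[v] < m} for m in values]
    return P, V, I

values = [10, 7, 3]                       # m_0 = mr, m_1, m_2
mu_r = {"r": 10, "a": 7, "b": 3, "c": 3}  # hypothetical mu(v, r)
top = {"r": 10, "a": 7, "b": 10, "c": 5}  # hypothetical max over B ∪ {r}
s_b = {"b", "c"}                          # hypothetical containment area

P, V, I = layers(values, mu_r, top, s_b)
```

On this instance the sets $V_{m_i}$ grow with i while the sets $I_{m_i}$ shrink, which is the monotonic structure exploited by the layer-by-layer convergence argument.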

Lemma 3 For any $m_i \in M$, the set $\mathcal{LC}_{m_i}$ is closed by actions of SSMAX.

Proof Let $m_i$ be a metric value from $M$ and $\rho$ be a configuration of $\mathcal{LC}_{m_i}$. By construction, any process $v \in V_{m_i}$ satisfies $spec(v)$ in $\rho$.

In particular, the root process satisfies: $prnt_r = \bot$, $level_r = mr$, and $dist_r = 0$. By construction of SSMAX, $r$ is not enabled and hence never modifies its O-variables (since the guard of the rule of $r$ does not involve the state of its neighbors).

In the same way, any process $v \in V_{m_i}$ satisfies: $prnt_v \in N_v$, $level_v = met(level_{prnt_v}, w_{prnt_v,v})$, $dist_v = dist_{prnt_v} + 1$, and $level_v = \max_{\prec}\{met(level_u, w_{u,v}) \mid u \in N_v\}$. Note that, as $v \in V_{m_i}$ and $spec(v)$ holds in $\rho$, we have: $level_v = \mu(v,r) = \max_{\prec}\{\mu(v,p) \mid p \in B \cup \{r\}\}$ and $dist_v < D - 1$ by construction of $D$. Hence, process $v$ is not enabled in $\rho$.

Assume that there exists a process $v \in V_{m_i}$ that takes a step $\rho' \mapsto \rho''$ in an execution starting from $\rho$ (without loss of generality, assume that $v$ is the first process of $V_{m_i}$ that takes a step in this execution). Then, we know that $v \neq r$. This activation implies that a neighbor $u \notin V_{m_i}$ of $v$ (since $v$ is the first process of $V_{m_i}$ to take a step) modified its $level_u$ variable to a metric value $m \in M$ such that $level_v \prec met(m, w_{u,v})$ in $\rho'$ (note that the O-variables of $v$ and $prnt_v$ remain consistent since $v$ is the first process to take a step in this execution).

Hence, we have $level_v = \max_{\prec}\{\mu(v,p) \mid p \in B \cup \{r\}\} \prec met(m, w_{u,v})$. Moreover, the closure of $IM_{m_i}$ (established in Lemma 2) ensures that $m \preceq \max_{\prec}\{\mu(u,p) \mid p \in B \cup \{r\}\}$. By boundedness of $M$, we can deduce that $met(m, w_{u,v}) \preceq met(\max_{\prec}\{\mu(u,p) \mid p \in B \cup \{r\}\}, w_{u,v})$. Consequently, we obtain that:

$$\max_{\prec}\{\mu(v,p) \mid p \in B \cup \{r\}\} \prec met\Big(\max_{\prec}\{\mu(u,p) \mid p \in B \cup \{r\}\},\ w_{u,v}\Big)$$

This contradicts the result of Lemma 1.

In conclusion, any process $v \in V_{m_i}$ takes no step in any execution starting from $\rho$ and hence always satisfies $spec(v)$. Then, the closure of $IM_{m_i}$ (established in Lemma 2) concludes the proof. □

Lemma 4 Any configuration of $\mathcal{LC}$ is $(S_B, n-1)$-TA contained for $spec$.

Proof This is a direct application of Lemma 3 to $\mathcal{LC} = \mathcal{LC}_{m_k}$. □

Lemma 5 Starting from any configuration of $C$, any execution of SSMAX reaches in a finite time a configuration of $\mathcal{LC}_{mr}$.

Proof Let $\rho$ be an arbitrary configuration. Then, it is obvious that $IM_{mr}(\rho)$ is satisfied. By closure of $IM_{mr}$ (proved in Lemma 2), we know that $IM_{mr}$ remains satisfied in any execution starting from $\rho$.

If $r$ does not satisfy $spec(r)$ in $\rho$, then $r$ is continuously enabled. Since the scheduling is weakly fair, $r$ is activated in a finite time and then $r$ satisfies $spec(r)$ in a finite time. Denote by $\rho'$ the first configuration in which $spec(r)$ holds. Note that $r$ takes no step in any execution starting from $\rho'$.

The boundedness of $M$ implies that $P_{mr}$ induces a connected subsystem. If $P_{mr} = \{r\}$, then we proved that $\rho' \in \mathcal{LC}_{mr}$ and we have the result.

Otherwise, observe that, for any configuration of an execution starting from $\rho'$, if no process of $P_{mr}$ is enabled, then all processes $v$ of $P_{mr}$ satisfy $spec(v)$. Assume now that there exists an execution $e$ starting from $\rho'$ in which some processes of $P_{mr}$ take infinitely many steps. By construction, at least one of these processes (denote it by $v$) has a neighbor $u$ which takes only a finite number of steps in $e$ (recall that $P_{mr}$ induces a connected subsystem and that $r$ takes no step in $e$). After $u$ takes its last step of $e$, we can observe that $level_u = mr$ and $dist_u < D - 1$ (otherwise, $u$ is activated in a finite time, which contradicts its construction).

As $v$ can execute $(R_1)$ consecutively only a finite number of times (since the incrementation of $dist_v$ is bounded by $D$), we can deduce that $v$ executes $(R_2)$ or $(R_3)$ infinitely often. In both cases, $u$ belongs to the set which is the parameter of function $choose$. By the fairness of this function, we can deduce that $prnt_v = u$ in a finite time in $e$. Then, the construction of $u$ implies that $v$ is never enabled in the sequel of $e$. This contradicts the construction of $e$.

Consequently, any execution starting from $\rho'$ reaches in a finite time a configuration such that no process of $P_{mr}$ is enabled. We can deduce that this configuration belongs to $\mathcal{LC}_{mr}$, which ends the proof. □
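The set passed to $choose$ in the argument above, together with the update that follows the choice, can be sketched as follows. This is an illustration only: it assumes a flow-style metric (met(m, w) = min(m, w) on integers), and it picks the first candidate, whereas the protocol relies on a fair $choose$ function. All identifiers and values are hypothetical.

```python
def met(m, w):
    # Assumed flow-style metric, used only for illustration.
    return min(m, w)

def candidates(v, neighbors, level, dist, weight, D):
    """Neighbors u of v with dist_u < D - 1 that maximize met(level_u, w_{u,v})."""
    eligible = [u for u in neighbors[v] if dist[u] < D - 1]
    if not eligible:
        return []
    best = max(met(level[u], weight[(u, v)]) for u in eligible)
    return [u for u in eligible if met(level[u], weight[(u, v)]) == best]

def adopt_parent(v, neighbors, level, dist, weight, D):
    """Pick a candidate (here: the first one) and derive v's new level and distance."""
    cands = candidates(v, neighbors, level, dist, weight, D)
    if not cands:
        return None
    p = cands[0]  # assumption: stands in for the fair choose function
    return p, met(level[p], weight[(p, v)]), dist[p] + 1

neighbors = {"v": ["u1", "u2", "u3"]}
level = {"u1": 7, "u2": 10, "u3": 7}
dist = {"u1": 1, "u2": 3, "u3": 0}
weight = {("u1", "v"): 5, ("u2", "v"): 9, ("u3", "v"): 7}

cands = candidates("v", neighbors, level, dist, weight, D=4)
choice = adopt_parent("v", neighbors, level, dist, weight, D=4)
```

With D = 4, neighbor `u2` is excluded despite offering the best metric, because its distance is not smaller than D - 1; this is the filter that prevents a process from attaching to a neighbor whose distance has already reached the bound.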
Lemma 6 For any $m_i \in M$ and for any configuration $\rho \in \mathcal{LC}_{m_i}$, any execution of SSMAX starting from $\rho$ reaches in a finite time a configuration such that:

$$\forall v \in I_{m_i},\ level_v = m_i \Rightarrow dist_v = D$$

Proof Let $m_i$ be an arbitrary metric value of $M$ and $\rho_0$ be an arbitrary configuration of $\mathcal{LC}_{m_i}$. Let $e = \rho_0, \rho_1, \ldots$ be an execution starting from $\rho_0$.

Note that $\rho_0$ satisfies $IM_{m_i}$ by construction. Hence, we have $\forall v \in I_{m_i},\ level_v \preceq m_i$. The closure of $IM_{m_i}$ (proved in Lemma 2) ensures that this property is satisfied in any configuration of $e$.

If every process $v \in I_{m_i}$ satisfies $level_v \prec m_i$ in $\rho_0$, then the result is obvious. Otherwise, we define the following variant function. For any configuration $\rho_j$ of $e$, we denote by $A_j$ the set of processes $v$ of $I_{m_i}$ such that $level_v = m_i$ in $\rho_j$. Then, we define $f(\rho_j) = \min\{dist_v \mid v \in A_j\}$. We will prove the result by showing that there exists an integer $k$ such that $f(\rho_k) = D$.

First, if a process $v$ joins $A_j$ (that is, $v \notin A_{j-1}$ but $v \in A_j$), then it takes a distance value greater than or equal to $f(\rho_{j+1})$ by construction of the protocol. We can deduce that the fact that some processes join $A_j$ does not decrease $f$. Moreover, the construction of the protocol implies that a process $v$ such that $v \in A_j$ and $v \in A_{j+1}$ cannot decrease its distance value in the step $\rho_j \mapsto \rho_{j+1}$.

Then, consider for a given configuration $\rho_j$ a process $v \in A_j$ such that $dist_v = f(\rho_j) < D$. We distinguish the following cases:

Case 1: $level_v = met(level_{prnt_v}, w_{v,prnt_v})$

The fact that $v \in I_{m_i}$, the boundedness of $M$ and the closure of $IM_{m_i}$ imply that $prnt_v \in A_j$ (and hence that $level_{prnt_v} = m_i$). Then, by construction of $f(\rho_j)$, we know that $dist_v < dist_{prnt_v} + 1$ (otherwise, we do not have $dist_v = f(\rho_j)$ since $prnt_v$ has a smaller distance value). Consequently, $v$ is enabled by $(R_1)$ in $\rho_j$ and $dist_v$ increases by at least 1 during the step $\rho_j \mapsto \rho_{j+1}$ if this rule is executed.

Case 2: $level_v \neq met(level_{prnt_v}, w_{v,prnt_v})$

The rule $(R_1)$ is then enabled for $v$. If this rule is executed during the step $\rho_j \mapsto \rho_{j+1}$, one of the two following subcases appears.

Case 2.1: $met(level_{prnt_v}, w_{v,prnt_v}) \prec m_i$

Then, $v$ does not belong to $A_{j+1}$ by definition.

Case 2.2: $met(level_{prnt_v}, w_{v,prnt_v}) = m_i$

Recall that the closure of $IM_{m_i}$ then implies that $level_{prnt_v} = m_i$. By construction of $f(\rho_j)$, we have $dist_{prnt_v} \geq f(\rho_j)$ in $\rho_j$. Then, we can see that $dist_v$ increases by at least 1 during the step $\rho_j \mapsto \rho_{j+1}$.

In all cases, $v$ is enabled by $(R_1)$ in $\rho_j$ and the execution of this rule either strictly increases $dist_v$ or removes $v$ from $A_{j+1}$.

As $I_{m_i}$ is finite and the scheduling is weakly fair, we can deduce that $f$ increases in a finite time in any execution starting from $\rho_j$. By repeating the argument at most $D$ times, we can deduce that $e$ contains a configuration $\rho_k$ such that $f(\rho_k) = D$, which shows the result. □
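The variant function used in this proof is easy to compute on a given configuration. The sketch below assumes integer metric values and, as a convention that is not taken from the paper, returns D when $A_j$ is empty (the case in which the proof considers the result obvious). The configuration values are hypothetical.

```python
def variant(level, dist, i_mi, m_i, D):
    """f(rho): minimum distance among processes of I_{m_i} whose level equals m_i."""
    a_j = [v for v in i_mi if level[v] == m_i]
    # Assumption: an empty A_j is mapped to D, the target value of the variant.
    return min((dist[v] for v in a_j), default=D)

i_mi = {"a", "b", "c"}
level = {"a": 5, "b": 5, "c": 3}
dist = {"a": 2, "b": 4, "c": 1}

f_before = variant(level, dist, i_mi, m_i=5, D=6)

# After "a" raises its distance to the bound D, the minimum moves to "b".
dist_after = dict(dist, a=6)
f_after = variant(level, dist_after, i_mi, m_i=5, D=6)
```

Process `c` is ignored in both calls because its level differs from $m_i$; only the members of $A_j$ contribute to the minimum.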

Lemma 7 For any $m_i \in M$ and for any configuration $\rho \in \mathcal{LC}_{m_i}$ such that $\forall v \in I_{m_i},\ level_v = m_i \Rightarrow dist_v = D$, any execution of SSMAX starting from $\rho$ reaches in a finite time a configuration such that:

$$\forall v \in I_{m_i},\ level_v \prec m_i$$

Proof Let $m_i \in M$ be an arbitrary metric value and $\rho_0$ be a configuration of $\mathcal{LC}_{m_i}$ such that $\forall v \in I_{m_i},\ level_v = m_i \Rightarrow dist_v = D$. Let $e = \rho_0, \rho_1, \ldots$ be an arbitrary execution starting from $\rho_0$.

For any configuration $\rho_j$ of $e$, let us denote $E_{\rho_j} = \{v \in I_{m_i} \mid level_v = m_i\}$. By the closure of $IM_{m_i}$ (which holds by definition in $\rho_0$) established in Lemma 2, we obtain the result if there exists a configuration $\rho_j$ of $e$ such that $E_{\rho_j} = \emptyset$.

If there exist some processes $v \in I_{m_i} \setminus E_{\rho_0}$ (and hence $level_v \prec m_i$) such that $prnt_v \in E_{\rho_0}$ and $met(level_{prnt_v}, w_{v,prnt_v}) = m_i$ in $\rho_0$, then we can observe that these processes are continuously enabled by $(R_1)$. As the scheduling is weakly fair, $v$ activates this rule in a finite time and then $level_v = m_i$ and $dist_v = D$. In other words, $v$ joins $E_{\rho_l}$ for a given integer $l$. We can conclude that there exists an integer $k$ such that, for any $v \in I_{m_i} \setminus E_{\rho_k}$, either $prnt_v \notin E_{\rho_k}$ or $met(level_{prnt_v}, w_{v,prnt_v}) \prec m_i$.

Then, we prove that, for any integer $j \geq k$, we have $E_{\rho_{j+1}} \subseteq E_{\rho_j}$. For the sake of contradiction, assume that there exist an integer $j \geq k$ and a process $v \in I_{m_i}$ such that $v \in E_{\rho_{j+1}}$ and $v \notin E_{\rho_j}$. Without loss of generality, assume that $j$ is the smallest integer which satisfies these properties. Let us study the following cases:

Case 1: If $v$ activates $(R_1)$ during the step $\rho_j \mapsto \rho_{j+1}$, then we know that $prnt_v \notin E_{\rho_j}$ in $\rho_j$ (otherwise, we have a contradiction with the fact that $v \in E_{\rho_{j+1}}$). But in this case, we have: $level_{prnt_v} \prec m_i$. The boundedness of $M$ implies that $level_v \prec m_i$ in $\rho_{j+1}$, which contradicts the fact that $v \in E_{\rho_{j+1}}$.

Case 2: If $v$ activates either $(R_2)$ or $(R_3)$ during the step $\rho_j \mapsto \rho_{j+1}$, then $v$ chooses a new parent which has a distance smaller than $D - 1$ in $\rho_j$. This implies that this new parent does not belong to $E_{\rho_j}$. Then, we have $level_{prnt_v} \prec m_i$. The boundedness of $M$ implies that $level_v \prec m_i$ in $\rho_{j+1}$, which contradicts the fact that $v \in E_{\rho_{j+1}}$.

In the two cases, our claim is satisfied. In other words, there exists a point of the execution after which the set $E$ cannot grow (this implies that, if a process leaves the set $E$, it leaves it definitively).

Assume now that there exists a step $\rho_j \mapsto \rho_{j+1}$ (with $j \geq k$) such that a process $v \in E_{\rho_j}$ is activated. Observe that the closure of $IM_{m_i}$ implies that $v$ cannot be activated by the rule $(R_3)$. If $v$ activates $(R_1)$ during this step, then $v$ modifies its level during this step (otherwise, we have a contradiction with the fact that $level_{prnt_v} = m_i \Rightarrow dist_v = D$). The closure of $IM_{m_i}$ implies that $v$ leaves the set $E$ during this step. If $v$ activates $(R_2)$ during this step, then $v$ chooses a new parent which has a distance smaller than $D - 1$ in $\rho_j$. This implies that this new parent does not belong to $E_{\rho_j}$. Then, we have $level_{prnt_v} \prec m_i$. The boundedness of $M$ implies that $level_v \prec m_i$ in $\rho_{j+1}$. In other words, if a process $v$ of $E_{\rho_j}$ is activated during the step $\rho_j \mapsto \rho_{j+1}$, then it satisfies $v \notin E_{\rho_{j+1}}$.

Finally, observe that the construction of the protocol and the construction of the bound $D$ ensure that any process $v \in I_{m_i}$ such that $dist_v = D$ is activated in a finite time. In conclusion, we obtain that there exists an integer $j$ such that $E_{\rho_j} = \emptyset$, which implies the result. □

Lemma 8 For any $m_i \in M$ and for any configuration $\rho \in \mathcal{LC}_{m_i}$, any execution of SSMAX starting from $\rho$ reaches in a finite time a configuration $\rho'$ such that $IM_{m_{i+1}}$ holds.

Proof This result is a direct consequence of Lemmas 6 and 7. □

Lemma 9 For any $m_i \in M$ and for any configuration $\rho \in \mathcal{LC}_{m_i}$, any execution of SSMAX starting from $\rho$ reaches in a finite time a configuration of $\mathcal{LC}_{m_{i+1}}$.

Proof Let $m_i$ be a metric value of $M$ and $\rho$ be an arbitrary configuration of $\mathcal{LC}_{m_i}$. We know by Lemma 8 that any execution starting from $\rho$ reaches in a finite time a configuration $\rho'$ such that $IM_{m_{i+1}}$ holds. By closure of $IM_{m_{i+1}}$ and of $\mathcal{LC}_{m_i}$ (established respectively in Lemmas 2 and 3), we know that any configuration of any execution starting from $\rho'$ belongs to $\mathcal{LC}_{m_i}$ and satisfies $IM_{m_{i+1}}$.

We know that $V_{m_i} \neq \emptyset$ since $r \in V_{m_i}$ for any $i \geq 0$. Recall that $V_{m_{i+1}}$ is connected by the boundedness of $M$. Then, we know that there exists at least one process $p$ of $P_{m_{i+1}}$ which has a neighbor $q$ in $V_{m_i}$ such that $\mu(p,r) = met(\mu(q,r), w_{p,q})$. Moreover, Lemma 3 ensures that any process of $V_{m_i}$ takes no step in any execution starting from $\rho'$.

Observe that, for any configuration of an execution starting from $\rho'$, if no process of $P_{m_{i+1}}$ is enabled, then all processes $v$ of $P_{m_{i+1}}$ satisfy $spec(v)$. Assume now that there exists an execution $e$ starting from $\rho'$ in which some processes of $P_{m_{i+1}}$ take infinitely many steps. By construction, at least one of these processes (denote it by $v$) has a neighbor $u$ such that $\mu(v,r) = met(\mu(u,r), w_{v,u})$ which takes only a finite number of steps in $e$ (recall the construction of $p$). After $u$ takes its last step of $e$, we can observe that $level_u = \mu(u,r)$ and $dist_u < D - 1$ (otherwise, $u$ is activated in a finite time, which contradicts its construction).

As $v$ can execute $(R_1)$ consecutively only a finite number of times (since the incrementation of $dist_v$ is bounded by $D$), we can deduce that $v$ executes $(R_2)$ or $(R_3)$ infinitely often. In both cases, $u$ belongs to the set which is the parameter of function $choose$ (recall that $IM_{m_{i+1}}$ is satisfied and that $u$ has the best possible metric among $v$'s neighbors). By the construction of this function, we can deduce that $prnt_v = u$ in a finite time in $e$. Then, the construction of $u$ implies that $v$ is never enabled in the sequel of $e$. This contradicts the construction of $e$.

Consequently, any execution starting from $\rho'$ reaches in a finite time a configuration such that no process of $P_{m_{i+1}}$ is enabled. We can deduce that this configuration belongs to $\mathcal{LC}_{m_{i+1}}$, which ends the proof. □

Lemma 10 Starting from any configuration, any execution of SSMAX reaches a configuration of $\mathcal{LC}$ in a finite time.

Proof Let $\rho$ be an arbitrary configuration. We know by Lemma 5 that any execution starting from $\rho$ reaches in a finite time a configuration of $\mathcal{LC}_{mr} = \mathcal{LC}_{m_0}$. Then, we can apply at most $k$ times the result of Lemma 9 to obtain that any execution starting from $\rho$ reaches in a finite time a configuration of $\mathcal{LC}_{m_k} = \mathcal{LC}$, which proves the result. □

Theorem 3 SSMAX is a $(S_B, n-1)$-TA-strictly stabilizing protocol for $spec$.

Proof This result is a direct consequence of Lemmas 4 and 10. □

Note that Theorem 2 ensures that $S_B$ is the optimal containment area for a topology-aware strictly stabilizing protocol for $spec$.

5 Conclusion 

We introduced a new notion of Byzantine containment in self-stabilization: topology-aware strict stabilization. This notion relaxes the constraint on the containment radius of strict stabilization to a containment area. In other words, the set of correct processes which may be infinitely often disturbed by Byzantine processes is a function of the topology of the system and of the actual location of Byzantine processes. We illustrated the relevance of this notion by providing a topology-aware strictly stabilizing protocol for the maximum metric tree construction problem, which does not admit a strictly stabilizing solution. Moreover, our protocol achieves the optimal containment area with respect to topology-aware strict stabilization.

Our work raises some open questions. A number of problems do not admit a strictly stabilizing solution. Does any of them admit a topology-aware strictly stabilizing solution? Is it possible to give a necessary and/or sufficient condition for a problem to admit a topology-aware strictly stabilizing solution? What happens if we consider only bounded Byzantine behavior?